OPTIMAL TRANSPORT AND DYNAMICS OF
EXPANDING CIRCLE MAPS ACTING ON
MEASURES


Abstract. — In this article we compute the derivative of the action on probability measures of an expanding circle map at its absolutely continuous invariant measure. The derivative is defined using optimal transport: we use the rigorous framework set up by N. Gigli to endow the space of measures with a kind of differential structure.

It turns out that 1 is an eigenvalue of infinite multiplicity of this derivative, and we deduce that the absolutely continuous invariant measure can be deformed in many ways into atomless, nearly invariant measures.

We also show that the action of standard self-covering maps on measures has positive metric mean dimension.



1. Introduction



The theory of optimal transport has drawn much attention in recent 
years. Its applications to geometry and PDEs have in particular been 
largely disseminated. In this paper, we would like to show its effective- 
ness in a dynamical context. We are interested in arguably the simplest 
dynamical system where the action on measures is significantly different 
from the action on points, namely expanding circle maps. 

Another goal of the paper is to exemplify the rigorous differential structure defined by N. Gigli [Gig09a], for the simplest possible compact manifold. Note that one can use absolutely continuous curves to define the almost everywhere differentiability of maps, see in particular [Gig09b] where this method is applied to the exponential map. Other previous uses of variants of this manifold structure include the definition of gradient flows, as in the pioneering [Ott01] and in [AGS08], and of curvature, as in [Lot08]. But to our knowledge, no example of an explicit derivative of a measure-defined map at a given point had been computed.

1.1. An important model example. — Let us first consider the usual degree $d$ self-covering map of the circle $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$ defined by

$$\Phi_d(x) = dx \mod 1.$$

It acts on the set $\mathscr{P}(\mathbb{S}^1)$ of Borel probability measures, endowed with the topology of weak convergence, by the push-forward map $\Phi_{d\#}$.

A map like $\Phi_d$ can act by composition on the right on a function space (e.g. Sobolev spaces). The adjoint of this map is usually called a Perron-Frobenius operator or a transfer operator, and a great deal of effort has been made to understand these operators, especially their spectral properties (see for example [Bal00]). One can consider $\Phi_{d\#}$ as an analogue for possibly singular measures of the Perron-Frobenius operator of $\Phi_d$.

As pointed out by the referee of a previous version of this paper, using the finite-to-one maps

$$(x_1, \dots, x_n) \mapsto \frac{1}{n}\delta_{x_1} + \dots + \frac{1}{n}\delta_{x_n},$$

it is easy to prove that $\Phi_{d\#}$ is topologically transitive and has infinite topological entropy. To refine this last remark, we shall prove that $\Phi_{d\#}$ has positive metric mean dimension (a metric dynamical invariant of infinite-entropy maps).

Theorem 1.1. — For all integers $d \ge 2$ and all exponents $p \in [1, +\infty)$ we have

$$\operatorname{mdim}_M(\Phi_{d\#}, W_p) \ge p(d-1)$$

where $W_p$ is the Wasserstein metric with cost $|\cdot|^p$.

The definition of Wasserstein metrics is given below; for the definition of metric mean dimension and the proof of the above result, see Section 2. Except in this result, we shall only use the quadratic Wasserstein metric.

Our main goal is to study the first-order dynamics of $\Phi_{d\#}$ near the uniform measure $\lambda$. The precise setting will be exposed later; let us just give a few elements. The tangent space $T_\mu$ to $\mathscr{P}(\mathbb{S}^1)$ at a measure $\mu$ that is absolutely continuous with continuous density identifies with the Hilbert space $L^2_0(\mu)$ of all vector fields $v : \mathbb{S}^1 \to \mathbb{R}$ that are $L^2$ with respect to $\mu$, and such that $\int v\,\lambda = 0$. More generally, if $\mu$ is atomless, $T_\mu$ identifies with a Hilbert subspace $L^2_0(\mu)$ of $L^2(\mu)$. We have a kind of exponential map: $\exp_\mu(v) = \mu + v := (\mathrm{Id} + v)_\#\mu$. Then we say that a map $f$ acting on $\mathscr{P}(\mathbb{S}^1)$ has Gâteaux derivative $L$ at $\mu$ if $f(\mu)$ has no atom and $L : L^2_0(\mu) \to L^2_0(f(\mu))$ is a continuous linear operator such that for all $v$ we have

$$W(f(\mu + tv), f(\mu) + tLv) = o(t).$$

Our first differentiability result is the following. 

Theorem 1.2. — The map $\Phi_{d\#}$ has a Gâteaux derivative at $\lambda$, equal to $d$ times the Perron-Frobenius operator of $\Phi_d$ acting on $L^2_0(\lambda)$. In particular its spectrum is the disc of radius $d$ and all numbers of modulus $< d$ are eigenvalues with infinite multiplicity.

This result is detailed as Theorem 4.1 and Proposition 4.4 below. We shall also see that $\Phi_{d\#}$ is not Fréchet differentiable.

1.2. General expanding maps. — The next step is to consider the action on measures of expanding circle maps. In Section 5, given a general expanding map $\Phi$, we compute the derivative of $\Phi_\#$ at its unique absolutely continuous invariant measure (Theorem 5.1). Instead of writing down the expression here, let us simply state the following.

Theorem 1.3. — If $\Phi$ is a $C^2$ expanding circle map, $\Phi_\#$ has a Gâteaux derivative at its unique invariant absolutely continuous measure $\rho\lambda$, whose adjoint operator in $L^2_0(\rho\lambda)$ is $u \mapsto \Phi'\, u \circ \Phi$.

In particular this derivative is a multiple of the Perron-Frobenius operator (on $L^2_0(\rho\lambda)$) only when $\Phi'$ is constant, that is when $\Phi$ is a model map. Using general results in the spectral theory of transfer operators, it is however possible to prove that 1 is always an eigenvalue of infinite multiplicity, with continuous eigenfunctions.

1.3. Nearly invariant measures. — The spectral study of $D_{\rho\lambda}(\Phi_\#)$ gives us large families of nearly invariant measures, with Lipschitz parametrization.

Theorem 1.4. — For all integers $n$, there is a bi-Lipschitz embedding $F : B^n \to \mathscr{P}(\mathbb{S}^1)$ mapping $0$ to the absolutely continuous invariant measure $\rho\lambda$ of $\Phi$ such that for all $a \in B^n$,

$$W(\Phi_\#(F(a)), F(a)) = o(|a|).$$

As a consequence, for all $\varepsilon > 0$ and all integers $K$ there is a radius $r > 0$ such that for all $k \le K$ and all $a \in B^n(0, r)$ the following holds:

$$W(\Phi^k_\#(F(a)), F(a)) \le \varepsilon|a|.$$

Here $B^n$ denotes the unit Euclidean ball centered at $0$ and $W$ is the quadratic Wasserstein distance (whose definition is recalled below).

It is easy to construct invariant measures near the absolutely continuous one, for example supported on a union of periodic orbits. One can also consider convex sums $(1-a)\rho\lambda + a\mu$ where $\mu$ is any invariant measure and $a \ll 1$. But note that the curves $a \mapsto (1-a)\rho\lambda + a\mu$ need not be rectifiable, let alone Lipschitz. Bernoulli measures are also examples; they are singular, atomless, fully supported invariant measures of $\Phi_d$ that can be arbitrarily close to $\lambda$.

The nearly invariant measures above seem of a different nature, and a natural question is how regular they are. They are given by push-forwards of the uniform measure by continuous functions; for example in the model case a one parameter family is given by

$$\Big(\mathrm{Id} + t\sum_{k=0}^{\infty} d^{-k}\cos(2\pi d^k\,\cdot)\Big)_\#\lambda$$

where $t \in [0, \varepsilon)$. This makes it easy to prove that almost all of them are atomless.

Proposition 1.5. — If $\mu$ is an atomless measure and $v \in L^2(\mu)$, for all but a countable number of values of $t \in [0, 1]$, the measure $\mu + tv = (\mathrm{Id} + tv)_\#\mu$ has no atom.

In particular, with the notation of Theorem 1.4, for almost all $a$ the measure $F(a)$ has no atom.

This leaves open the following, antagonist questions.

Question 1. — Is the measure $F(a)$ absolutely continuous for most, or at least some $a \ne 0$?

Question 2. — Is the measure $F(a)$ invariant for most, or at least some $a \ne 0$?
The next natural questions, not addressed at all here, concern the dynamical properties of the action on measures of higher dimensional hyperbolic dynamical systems like Anosov maps or flows, or of discontinuous systems like interval exchange maps.

1.4. Recalls and notations. — The most convenient point of view here is to construct the circle as the quotient $\mathbb{R}/\mathbb{Z}$. We shall often and without notice write a real number $x \in [0, 1)$ to mean its image by the canonical projection. We proceed similarly for intervals of length less than 1.

Recall that the push-forward of a measure is defined by $\Phi_\#\mu(A) = \mu(\Phi^{-1}(A))$ for all Borel sets $A$.
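For finitely supported measures the push-forward can be computed by hand. The following minimal sketch illustrates it for the model map $x \mapsto dx \mod 1$; the dictionary representation of a measure and the names `phi` and `push_forward` are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def phi(x, d):
    """Model map x -> d*x mod 1 on the circle R/Z (exact on Fractions)."""
    return (d * x) % 1


def push_forward(mu, d):
    """Push a finitely supported measure {point: mass} forward by phi.

    The mass of each atom x is added to the atom phi(x), which is the
    discrete form of (phi_# mu)(A) = mu(phi^{-1}(A)).
    """
    image = defaultdict(Fraction)
    for x, mass in mu.items():
        image[phi(x, d)] += mass
    return dict(image)


# Hypothetical example: three atoms pushed by the doubling map (d = 2).
mu = {
    Fraction(0): Fraction(1, 2),
    Fraction(1, 2): Fraction(1, 4),
    Fraction(1, 4): Fraction(1, 4),
}
nu = push_forward(mu, 2)
```

The atoms at $0$ and $1/2$ have the same image, so their masses add up; this merging of mass is what makes the action on measures differ from the action on points.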

For a detailed introduction on optimal transport, the interested reader can for example consult [Vil03]. Let us give an overview of the properties we shall need. Given an exponent $p \in [1, \infty)$, if $(X, d)$ is a general metric space, assumed to be Polish (complete separable) to avoid measurability issues and endowed with its Borel $\sigma$-algebra, its $L^p$ Wasserstein space is the set $\mathscr{W}_p(X)$ of probability measures $\mu$ on $X$ whose $p$-th moment is finite:

$$\int d^p(x_0, x)\,\mu(dx) < \infty \quad\text{for some, hence all } x_0 \in X,$$

endowed with the following metric: given $\mu, \nu \in \mathscr{W}_p(X)$ one sets

$$W_p(\mu, \nu) = \Big(\inf_\Pi \int_{X\times X} d^p(x, y)\,\Pi(dx\,dy)\Big)^{1/p}$$

where the infimum is over all probability measures $\Pi$ on $X \times X$ that project to $\mu$ on the first factor and to $\nu$ on the second one. Such a measure is called a transport plan between $\mu$ and $\nu$, and is said to be optimal when it achieves the infimum. In this setting, an optimal transport plan always exists. Note that when $X$ is compact, the set $\mathscr{W}_p(X)$ is equal to the set $\mathscr{P}(X)$ of all probability measures on $X$.

The name "transport plan" is suggestive: it is a way to describe what amount of mass is transported from one region to another.

The function $W_p$ is a metric, called the ($L^p$) Wasserstein metric, and when $X$ is compact it induces the weak topology. We sometimes denote $W_2$ simply by $W$.
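To make the definition concrete, here is a minimal sketch on the real line rather than on the circle, under the assumption that both measures are empirical measures with the same number of equally weighted atoms. In that case the monotone (sorted) coupling is optimal for every $p \ge 1$, so no optimization is needed. The function name `wasserstein_line` is hypothetical.

```python
def wasserstein_line(xs, ys, p=2):
    """W_p between two empirical measures on R with equally weighted atoms.

    Assumes len(xs) == len(ys). On the line, pairing the sorted samples
    gives an optimal transport plan for the cost |x - y|**p with p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    n = len(xs)
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys))) / n
    return cost ** (1.0 / p)


# A translation by 3 costs exactly 3, whatever the exponent.
shift_distance = wasserstein_line([0, 1, 2], [3, 4, 5], p=2)
# Splitting a double atom at 0 into atoms at -1 and 1 moves all mass by 1.
split_distance = wasserstein_line([0, 0], [-1, 1], p=1)
```

On the circle the same idea applies after cutting the circle at a suitable point, but the choice of that point is itself part of the optimization and is not handled by this sketch.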
2. Metric mean dimension

Metric mean dimension is a metric invariant of dynamical systems introduced by Lindenstrauss and Weiss [LW00], that refines topological entropy for infinite-entropy systems.

Let us briefly recall the definitions. Given a map $f : X \to X$ acting on a metric space, for any $n \in \mathbb{N}$ one defines a new metric on $X$ by

$$d_n(x, y) := \max\{d(f^k(x), f^k(y)) \mid 0 \le k \le n\}.$$

Given $\varepsilon > 0$, one says that a subset $S$ of $X$ is $(n, \varepsilon)$-separated if $d_n(x, y) \ge \varepsilon$ whenever $x \ne y \in S$. Denoting by $N(f, \varepsilon, n)$ the maximal size of an $(n, \varepsilon)$-separated set, the topological entropy of $f$ is defined as

$$h(f) := \lim_{\varepsilon\to 0}\limsup_{n\to+\infty} \frac{\log N(f, \varepsilon, n)}{n}.$$

Note that this limit exists since $\limsup_{n\to+\infty} \frac{1}{n}\log N(f, \varepsilon, n)$ is nonincreasing in $\varepsilon$. The adjective "topological" is relevant since $h(f)$ does not depend upon the distance on $X$, but only on the topology it defines. The topological entropy is in some sense a global measure of the dependence on initial condition of the considered dynamical system. The map $\Phi_d$ is a classical example, whose topological entropy is $\log d$.

Now, the metric mean dimension is

$$\operatorname{mdim}_M(f, d) := \liminf_{\varepsilon\to 0}\limsup_{n\to+\infty} \frac{\log N(f, \varepsilon, n)}{n|\log\varepsilon|}.$$

It is zero as soon as topological entropy is finite. Note that this quantity does depend upon the metric; here we shall use $W_p$. Lindenstrauss and Weiss define the metric mean dimension using covering sets rather than separated sets, but this does not matter since their sizes are comparable.
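The definitions above can be explored numerically on the model map acting on points. The sketch below is an illustration only: it builds an $(n, \varepsilon)$-separated set greedily on a finite grid, which gives a lower bound for $N(f, \varepsilon, n)$ and not necessarily the maximal size. The grid, the value of $\varepsilon$ and all function names are assumptions chosen for the example.

```python
from fractions import Fraction


def circle_dist(x, y):
    """Distance on the circle R/Z."""
    r = (x - y) % 1
    return min(r, 1 - r)


def bowen_dist(x, y, n, d=2):
    """d_n(x, y): largest distance between the orbits up to time n."""
    best = circle_dist(x, y)
    for _ in range(n):
        x, y = (d * x) % 1, (d * y) % 1
        best = max(best, circle_dist(x, y))
    return best


def greedy_separated(points, n, eps, d=2):
    """Scan the points in order, keeping those eps-far (for d_n) from all kept ones."""
    chosen = []
    for x in points:
        if all(bowen_dist(x, y, n, d) >= eps for y in chosen):
            chosen.append(x)
    return chosen


grid = [Fraction(i, 64) for i in range(64)]
sizes = [len(greedy_separated(grid, n, Fraction(1, 4))) for n in range(3)]
```

For the doubling map the sizes found double with each extra iterate, in line with the growth rate $\log 2$ of the topological entropy.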

Let us prove Theorem 1.1: the metric mean dimension of $\Phi_{d\#}$ is at least $p(d-1)$ when $\mathscr{P}(\mathbb{S}^1)$ is endowed with the $W_p$ metric. In another paper [Klo10], we prove the same kind of result, replacing $\Phi_d$ by any map having positive entropy. However Theorem 1.1 has a better constant and its proof is simpler.

Proof of Theorem 1.1. — To construct a large $(n, \varepsilon)$-separated set, we proceed as follows: we start with the point $\delta_0$, and choose an $\varepsilon$-separated set of its inverse images. Then we inductively choose $\varepsilon$-separated sets of inverse images of each element of the set previously defined. Doing this, we need not control the distance between inverse images of two different elements.

Let $k > 1$ and $\alpha > 0$ be integers; $\varepsilon$ will be exponential in $-k$. Let $A_k$ be the set of all $\mu \in \mathscr{P}(\mathbb{S}^1)$ such that $\mu((1 - 2^{-k}, 1)) = 0$ and $\mu([0, 1/d]) \ge 1/2$. These conditions are designed to bound from below the distances between the antecedents to be constructed: a given amount of mass (second condition) will have to travel a given distance (first condition).

An element $\mu \in A_k$ decomposes as $\mu = \mu_h + \mu_t$ where $\mu_h$ is supported on $[0, 1 - d2^{-k}]$ and $\mu_t$ is supported on $(1 - d2^{-k}, 1 - 2^{-k})$. Let $e_1, \dots, e_d$ be the right inverses to $\Phi_d$ defined onto $[0, 1/d), [1/d, 2/d), \dots, [(d-1)/d, 1)$ respectively. For all integer tuples $\ell = (\ell_1, \dots, \ell_d)$ such that $\ell_1 \ge 2^{\alpha k - 1}$ and $\sum_i \ell_i = 2^{\alpha k}$ define

$$\mu_\ell = e_{1\#}(\ell_1 2^{-\alpha k}\mu_h + \mu_t) + \sum_{i>1} e_{i\#}(\ell_i 2^{-\alpha k}\mu_h)$$

(see figure 1 that illustrates the case $d = 2$). It is a probability measure on $\mathbb{S}^1$, lies in $A_k$ and $\Phi_{d\#}(\mu_\ell) = \mu$. Moreover, if $\ell' \ne \ell$ then any transport plan from $\mu_\ell$ to $\mu_{\ell'}$ has to move a mass at least $2^{-\alpha k - 1}$ by a distance at least $2^{-k}d^{-1}$. Therefore,

$$W_p(\mu_\ell, \mu_{\ell'}) \ge d^{-1}2^{-k(\alpha/p + 1) - 1/p}.$$
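The construction of the antecedents can be tried out on finitely supported measures. The following sketch assumes measures are stored as dictionaries of exact fractions and that the input lies in the set of measures with no mass in $(1 - 2^{-k}, 1)$; the names `antecedent` and `push_forward` are illustrative, and the check only covers the identity that pushing the antecedent forward recovers the original measure.

```python
from collections import defaultdict
from fractions import Fraction


def push_forward(mu, d):
    """Push {point: mass} forward by x -> d*x mod 1."""
    out = defaultdict(Fraction)
    for x, m in mu.items():
        out[(d * x) % 1] += m
    return dict(out)


def antecedent(mu, ell, d, k, alpha):
    """Build the measure indexed by ell = (ell_1, ..., ell_d) from mu.

    The head part of mu (atoms in [0, 1 - d*2**-k]) is split among the d
    inverse branches with weights ell_i / 2**(alpha*k); the tail part is
    sent entirely through the first branch.
    """
    total = 2 ** (alpha * k)
    if sum(ell) != total or 2 * ell[0] < total:
        raise ValueError("ell must sum to 2**(alpha*k) with ell_1 at least half of it")
    cut = 1 - Fraction(d, 2 ** k)
    out = defaultdict(Fraction)
    for x, m in mu.items():
        if x <= cut:
            for i, weight in enumerate(ell):
                if weight:
                    out[(x + i) / d] += Fraction(weight, total) * m
        else:
            out[x / d] += m
    return dict(out)


# Hypothetical parameters: d = 2, k = 3, alpha = 1, starting from the Dirac mass at 0.
delta0 = {Fraction(0): Fraction(1)}
first = antecedent(delta0, (5, 3), d=2, k=3, alpha=1)
second = antecedent(first, (6, 2), d=2, k=3, alpha=1)
```

Iterating the construction as in `second` mimics the inductive step of the proof: each new generation is a set of preimages of the previous one.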

Figure 1. Construction of separated antecedents of a given measure.
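Let $\varepsilon = d^{-1}2^{-k(\alpha/p + 1) - 1/p}$ and define $S_n$ inductively as follows. First, $S_0 = \{\delta_0\}$. Given $S_n \subset A_k$, $S_{n+1}$ is the set of all $\mu_\ell$ constructed above, where $\mu$ runs through $S_n$.

By construction, $S_{n+1}$ has at least $C2^{\alpha k(d-1)}$ times as many elements as $S_n$, for some constant $C$ depending only on $d$. Then $S_n$ has at least $C^n 2^{n\alpha k(d-1)}$ elements. Let $\mu, \nu$ be two distinct elements of $S_n$ and $m$ be the greatest index such that $\Phi^m_{d\#}\mu \ne \Phi^m_{d\#}\nu$. Since $\Phi^n_{d\#}\mu = \delta_0 = \Phi^n_{d\#}\nu$, $m$ exists and is at most $n - 1$. The measures $\mu' = \Phi^m_{d\#}\mu$ and $\nu' = \Phi^m_{d\#}\nu$ both lie in $S_{n-m}$ and have the same image. Therefore, they are $\varepsilon$-separated. This shows that $S_n$ is $(n, \varepsilon)$-separated.

It follows that

$$\frac{\log N(\Phi_{d\#}, \varepsilon, n)}{n|\log\varepsilon|} \ge \frac{\log C}{|\log\varepsilon|} + \frac{\alpha(d-1)k\log 2}{|\log\varepsilon|} = \frac{\alpha(d-1)}{\frac{\alpha}{p} + 1}(1 + o(1)) + o(1).$$

In the case of a general $\varepsilon$, we get the same bound on $\log N$ up to an additive term $n\alpha(d-1)\log 2$, so that

$$\operatorname{mdim}_M(\Phi_{d\#}, W_p) \ge \frac{\alpha(d-1)}{\frac{\alpha}{p} + 1}.$$

By taking $\alpha \to \infty$ we get $\operatorname{mdim}_M(\Phi_{d\#}, W_p) \ge p(d-1)$. □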



3. The first-order differential structure on measures 

In this section we give a short account of the work of Gigli [Gig09a] in the particular case of the circle. Note that considering the Wasserstein space of a Riemannian manifold as an infinite-dimensional Riemannian manifold dates back to the work of Otto [Ott01]. However, in many ways it stayed a formal view until the work of Gigli.

3.1. Why bother with this setting? — Before getting started, let us explain why we do not simply use the natural affine structure on $\mathscr{P}(\mathbb{S}^1)$, the tangent space at a point simply consisting of signed measures having zero total mass. Similarly, one could consider it simpler to just take the smooth functions of $\mathbb{S}^1$ as coordinates to define a smooth structure on $\mathscr{P}(\mathbb{S}^1)$.

The first argument against these points of view is that optimal transportation is about pushing mass, not (directly) about recording the variation of density at each point.

More important, these simple ideas would lead a path of the form $\gamma_t = t\delta_x + (1-t)\delta_y$ to be smooth. However, the Wasserstein distance between $\gamma_t$ and $\gamma_s$ has the order of $\sqrt{|t - s|}$, so that $\gamma_t$ is not rectifiable (it has infinite length)! This also holds, for example, for convex sums of measures with different supports.
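The lack of rectifiability can be seen on a small computation. For two-atom measures of the form $t\delta_x + (1-t)\delta_y$, a mass $|t - s|$ has to travel between $x$ and $y$, so the quadratic Wasserstein distance between the measures at times $t$ and $s$ equals $\sqrt{|t-s|}$ times the distance from $x$ to $y$. The sketch below, with hypothetical function names, sums these distances along finer and finer subdivisions of $[0, 1]$.

```python
import math


def w2_two_atoms(t, s, dist_xy):
    """Quadratic Wasserstein distance between t*delta_x + (1-t)*delta_y
    and s*delta_x + (1-s)*delta_y, where dist_xy is the distance from x to y."""
    return math.sqrt(abs(t - s)) * dist_xy


def polygonal_length(n_steps, dist_xy):
    """Sum of the distances between consecutive points of a uniform subdivision."""
    return sum(
        w2_two_atoms(i / n_steps, (i + 1) / n_steps, dist_xy)
        for i in range(n_steps)
    )


coarse = polygonal_length(100, 0.25)
fine = polygonal_length(400, 0.25)
```

The polygonal length grows like the square root of the number of steps, so it has no finite supremum: the path has infinite length.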

One could argue that the previous paths can be made Lipschitz by using $W_1$ instead of $W_2$, so let us give another argument: in the affine structure, the Lebesgue measure does not have a tangent space but only a tangent cone since $\lambda + t\mu$ is not a positive measure for all small $t$ unless $\mu \ll \lambda$. If one wants to consider singular measures in the same setting as regular ones, the $W_2$ setting seems to be the right tool.

Note that it will appear that the differential structure on $\mathscr{P}(\mathbb{S}^1)$ depends not only on the differential structure of the circle, but also on its metric. This should not be considered surprising: in finite dimension, the fact that the differential structures are defined independently of any reference to a metric comes from the equivalence of norms in Euclidean space; here, in infinite dimension, even the simple formula $W(f(\mu + tv), f(\mu) + tD_\mu f(v)) = o(t)$ involves a metric in a crucial way.

One could also be surprised that this differential structure involving the metric of the circle could be preserved by expanding maps of non-constant derivative. This point shall be cleared up in Section 5, see Proposition 5.2 and the discussion before it.



3.2. The exponential map. — Note that as is customary in these topics, by a geodesic we mean a non-constant globally minimizing geodesic segment or line, parametrized proportionally to arc length.

Given $\mu \in \mathscr{P}(\mathbb{S}^1)$, there are several equivalent ways to define its tangent space $T_\mu$. In fact, $T_\mu$ has a vectorial structure only when $\mu$ is atomless; otherwise it is only a tangent cone. Note that the atomless condition has to be replaced by a more intricate one in higher dimension.

The most Riemannian way to construct $T_\mu$ is to use the exponential map. Let $\mathscr{P}(T\mathbb{S}^1)_\mu$ be the set of probability measures on the tangent bundle $T\mathbb{S}^1$ that are mapped to $\mu$ by the canonical projection.
Given $\xi, \zeta \in \mathscr{P}(T\mathbb{S}^1)_\mu$, one defines

$$W_\mu(\xi, \zeta) = \Big(\inf_\Pi \int d^2(x, y)\,\Pi(dx\,dy)\Big)^{1/2}$$

where $d$ is any metric whose restriction to the fibers is the Riemannian distance (here the fibers are isometric to $\mathbb{R}$), and the infimum is over transport plans $\Pi$ that are mapped to the identity $(\mathrm{Id}, \mathrm{Id})_\#\mu$ by the canonical projection on $\mathbb{S}^1 \times \mathbb{S}^1$. This means that we only allow mass to move along the fibers. Equivalently, one can disintegrate $\xi$ and $\zeta$ along $\mu$, writing $\xi = \int \xi_x\,\mu(dx)$ and $\zeta = \int \zeta_x\,\mu(dx)$, with $(\xi_x)_{x\in\mathbb{S}^1}$ and $(\zeta_x)_{x\in\mathbb{S}^1}$ two families of probability measures on $T_x\mathbb{S}^1 \simeq \mathbb{R}$ uniquely defined up to sets of measure zero. Then one gets

$$W_\mu^2(\xi, \zeta) = \int W^2(\xi_x, \zeta_x)\,\mu(dx)$$

where one integrates the squared Wasserstein metric defined with respect to the Riemannian metric, that is $|\cdot|$.

There is a natural cone structure on $\mathscr{P}(T\mathbb{S}^1)_\mu$, extending the scalar multiplication on the tangent bundle: letting $D_r$ be the dilation of ratio $r$ along fibers, acting on $T\mathbb{S}^1$, one defines $r\cdot\xi := (D_r)_\#\xi$.

The exponential map $\exp : T\mathbb{S}^1 \to \mathbb{S}^1$ now gives a map

$$\exp_\# : \mathscr{P}(T\mathbb{S}^1)_\mu \to \mathscr{P}(\mathbb{S}^1).$$

The point is that not for all $\xi \in \mathscr{P}(T\mathbb{S}^1)_\mu$ is there an $\varepsilon > 0$ such that $t \mapsto \exp_\#(t\cdot\xi)$ defines a geodesic of $\mathscr{P}(\mathbb{S}^1)$ on $[0, \varepsilon)$. Consider for example $\mu = \lambda$, and let $\xi$ be defined by $\xi_x \equiv \delta_1$. Then $\exp_\#(t\cdot\xi) = \lambda$ for all $t$: one rotates all the mass while letting it in place would be more efficient.

The first definition is that $T_\mu$ is the closure in $\mathscr{P}(T\mathbb{S}^1)_\mu$ of the subset of all $\xi$ such that $\exp_\#(t\cdot\xi)$ defines a geodesic for small enough $t$.

3.3. Another definition of the tangent space. — Let us now give another definition, assuming $\mu$ is atomless. We denote by $|\cdot|_{L^2(\mu)}$ the norm defined by the measure $\mu$, and by $|\cdot|_2$ the usual $L^2$ norm defined by the Lebesgue measure $\lambda$.

Given a smooth function $f : \mathbb{S}^1 \to \mathbb{R}$, its gradient $\nabla f : \mathbb{S}^1 \to T\mathbb{S}^1$ can be used to push $\mu$ to an element $\xi_f = (\nabla f)_\#\mu$ of $\mathscr{P}(T\mathbb{S}^1)_\mu$. This element has the property that $\exp_\#(t\cdot\xi_f) = (\mathrm{Id} + t\nabla f)_\#\mu$ defines a geodesic for small enough $t$, with a time bound depending on $\nabla f$ and not on $\mu$. More precisely, the geodesicness holds as soon as no mass is moved a distance more than $1/2$, and no element of mass crosses another one, and these conditions translate to $t(\nabla f)'(x) \ge -1$ for all $x$. This is a particular case of Kantorovich duality, see for example [Vil09], especially figure 5.2.

Now, let $L^2_0(\mu)$ be the set of all vector fields $v \in L^2(\mu)$ that are $L^2(\mu)$-approximable by gradients of smooth functions. Then the image of the map $v \mapsto (\mathrm{Id}, v)_\#\mu$ defined on $L^2_0(\mu)$ with value in $\mathscr{P}(T\mathbb{S}^1)_\mu$ is precisely $T_\mu$. In particular, this means that as soon as $\mu$ is atomless, the disintegration $(\xi_x)_x$ of an element $\xi$ of $T_\mu$ writes $\xi_x = \delta_{v(x)}$ for some function $v$ and $\mu$-almost all $x$. Moreover, $v$ is $L^2(\mu)$-approximable by gradients of smooth functions; note that among smooth vector fields, gradients are characterized by $\int \nabla f\,\lambda = 0$. We shall freely identify the tangent space $T_\mu$ with $L^2_0(\mu)$ whenever $\mu$ has no atom.

In the important case when $\mu = \rho\lambda$ for some continuous density $\rho$, a vector field $v \in L^2(\mu)$ is approximable by gradients of smooth functions if and only if $\int v\,\lambda = 0$. We get that in this case, $T_\mu$ can be identified with the set of functions $v : \mathbb{S}^1 \to \mathbb{R}$ that are square-integrable with respect to $\mu$ and of mean zero with respect to $\lambda$. When $\mu$ is the uniform measure, we write $L^2_0$ instead of $L^2_0(\lambda)$. Note that if $w \in L^2(\mu)$ has neither its negative part nor its positive part $\lambda$-integrable, then it can be approximated in $L^2(\mu)$ norm by gradients of smooth functions, and that if $\mu$ does not have full support, then $L^2_0(\mu) = L^2(\mu)$.

For simplicity, given $\xi \in T_\mu$ identified with $v \in L^2_0(\mu)$, we shall denote $\exp_\#(t\cdot\xi)$ by $\mu + tv$. In other words, $\mu + tv = (\mathrm{Id} + tv)_\#\mu$.

This point of view is convenient, in particular because the distance between exponential curves issued from $\mu$ can be estimated easily:

$$W(\mu + tv, \mu + tw) \le t|v - w|_{L^2(\mu)}.$$

Note that when $v$ is differentiable, then by geodesicness for $t$ small enough we have

$$W(\mu, \mu + tv) = t|v|_{L^2(\mu)}$$

and not only an equivalent. This will prove useful in the next subsection where several measures and vector fields will be involved.

3.4. Two properties. — We shall prove that the exponential map can be used to construct bi-Lipschitz embeddings of small, finite-dimensional balls into $\mathscr{P}(\mathbb{S}^1)$, then we shall study how the density of an absolutely continuous measure evolves when pushed by a small vector field.

The following natural result shall be used in the proof of Theorem 1.4.

Proposition 3.1. — Given $\mu \in \mathscr{P}(\mathbb{S}^1)$ and $(v_1, \dots, v_n)$ continuous, linearly independent vector fields in $L^2_0(\mu)$, there is an $\eta > 0$ such that the map $B^n(0, \eta) \to \mathscr{P}(\mathbb{S}^1)$ defined by $E(a) = \mu + \sum a_i v_i$ is bi-Lipschitz.

The difficulty is only technical: we already know that $E$ is bi-Lipschitz along rays and we need some uniformity in the distance estimates to prove the global bi-Lipschitzness. The continuity hypothesis is not satisfactory but is all we need in the sequel.

Note that we did not assume that $\mu$ has no atom; when it has, $L^2_0(\mu)$ (still defined as the closure in $L^2(\mu)$ of gradients of smooth functions) is not the tangent cone $T_\mu\mathscr{P}(\mathbb{S}^1)$ but only a part of it. Note that if $v$ is a vector field of vanishing $\lambda$-mean, $(\mu + tv)_t$ still defines a geodesic as long as $tv' \ge -1$.

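Proof. — Let $a, b \in B^n$. The plan $(\mathrm{Id} + \sum a_i v_i, \mathrm{Id} + \sum b_i v_i)_\#\mu$ transports $E(a)$ to $E(b)$ at a cost

$$\Big|\sum (a_i - b_i)v_i\Big|^2_{L^2(\mu)} \le \Big(\sum |v_i|^2_{L^2(\mu)}\Big)|a - b|^2$$

so that $E$ is Lipschitz.

Up to a linear change of coordinates, we assume that the $v_i$ form an orthonormal family of $L^2(\mu)$. To bound the distance between $E(a)$ and $E(b)$ from below, we shall design a vector field $\tilde v$ such that pushing $E(a)$ by $\tilde v$ gives a measure close to $E(b)$.

Choose $\varepsilon > 0$ such that for all $i$ we have

$$|x - y| \le \varepsilon \implies |v_i(x) - v_i(y)| \le \frac{1}{4\sqrt{n}}.$$

Assume moreover $\varepsilon < 1/8$.

Let $w_i$ be gradients of smooth functions such that $|v_i - w_i|_\infty \le \varepsilon$. Let $\eta > 0$ be small enough to ensure $2\sqrt{n}\eta \le 1$ and $w_i' \ge -(4n\eta)^{-1}$ for all $i$.

Fix $a, b \in B^n(0, \eta)$ and introduce two maps defined by $\varphi(y) = y + \sum a_i v_i(y)$ and $\psi(y) = y + \sum a_i w_i(y)$. Note that $\psi' \ge 1/2$ so that $\psi$ is a diffeomorphism and $\psi^{-1}$ is 2-Lipschitz. Let $\tilde v = \sum (b_i - a_i)\,v_i\circ\psi^{-1}$.

On the first hand, given any $y \in \mathbb{S}^1$, we have

$$|\varphi(y) - \psi(y)| \le |a|\Big(\sum (w_i(y) - v_i(y))^2\Big)^{1/2} \le |a|\sqrt{n}\,\varepsilon$$

so that

$$|y - \psi^{-1}\varphi(y)| \le 2\sqrt{n}|a|\varepsilon \le \varepsilon$$

and

$$|v_i(y) - v_i(\psi^{-1}\varphi(y))| \le \frac{1}{4\sqrt{n}}.$$

It follows that

$$\Big|\tilde v\circ\varphi(y) - \sum (b_i - a_i)v_i(y)\Big| \le \frac{1}{4}|b - a|$$

and therefore

$$\Big|\tilde v\circ\varphi - \sum (b_i - a_i)v_i\Big|_{L^2(\nu)} \le \frac{1}{4}|b - a| \qquad (1)$$

where $\nu$ could be any probability measure. We shall take $\nu = \mu + \sum a_i v_i$. Similarly,

$$|\tilde v|_{L^2(\nu)} = |\tilde v\circ\varphi|_{L^2(\mu)} \ge \Big|\sum (b_i - a_i)v_i\Big|_{L^2(\mu)} - \frac{1}{4}|b - a| = \frac{3}{4}|b - a|. \qquad (2)$$

On the other hand, we have

$$W\Big(\mu + \sum a_i v_i, \mu + \sum b_i v_i\Big) \ge W(\nu, \nu + \tilde v) - W\Big(\nu + \tilde v, \mu + \sum b_i v_i\Big).$$

Let $\tilde w = \sum (b_i - a_i)\,w_i\circ\psi^{-1}$. We have $|\tilde v - \tilde w|_\infty \le \frac{1}{8}|b - a|$. In particular $|\tilde w|_{L^2(\nu)} \ge \frac{5}{8}|b - a|$. The choice of $\eta$ ensures that $\tilde w' \ge -1$, so that

$$W(\nu, \nu + \tilde w) = |\tilde w|_{L^2(\nu)} \ge \frac{5}{8}|b - a|.$$

Since $W(\nu + \tilde v, \nu + \tilde w) \le |\tilde v - \tilde w|_\infty$, we get

$$W(\nu, \nu + \tilde v) \ge \frac{1}{2}|b - a|. \qquad (3)$$

Finally, since $\nu + \tilde v = (\varphi + \tilde v\circ\varphi)_\#\mu$, (1) shows that

$$W\Big(\nu + \tilde v, \mu + \sum b_i v_i\Big) \le \frac{1}{4}|b - a|$$

so that

$$W\Big(\mu + \sum a_i v_i, \mu + \sum b_i v_i\Big) \ge \frac{1}{4}|b - a|.$$

□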

Proposition 3.2. — Let $\rho$ be a $C^1$ density and $v : \mathbb{S}^1 \to \mathbb{R}$ be a $C^1$ vector field. Then for $t \in \mathbb{R}$ small enough $\rho\lambda + tv$ is absolutely continuous and its density $\rho_t$ is continuous and satisfies

$$\rho_t(x) = \rho(x) - t(\rho v)'(x) + o(t)$$

where the remainder term is independent of $x$.

Proof. — Let $t$ be small enough so that $\mathrm{Id} + tv$ is a diffeomorphism. Then for all integrable functions $f$, one has

$$\int f(x)\,(\rho\lambda + tv)(dx) = \int f(x)\,(\mathrm{Id} + tv)_\#(\rho\lambda)(dx) = \int f(x + tv(x))\rho(x)\,dx = \int f(y)\Big(\frac{\rho}{1 + tv'}\Big)\circ(\mathrm{Id} + tv)^{-1}(y)\,dy$$

by a change of variable. It follows that

$$\rho_t = \frac{\rho}{1 + tv'}\circ(\mathrm{Id} + tv)^{-1} = (\rho(1 - tv'))\circ(\mathrm{Id} - tv) + o(t) = \rho - t(\rho'v + v'\rho) + o(t)$$

where the $o(t)$ term depends upon $\rho$ and $v$ but is uniform in $x$. □

Note that the $o(t)$ depends in particular on the moduli of continuity of $v'$ and $\rho'$ and need not be an $O(t^2)$ unless $v$ and $\rho$ are $C^2$.
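The first-order expansion of the density can be checked numerically. The sketch below uses a smooth density and a smooth vector field chosen only for illustration (they are assumptions, not taken from the text) and compares the exact density of the pushed measure, obtained from the change of variable, with the first-order formula.

```python
import math

TWO_PI = 2 * math.pi


def rho(x):
    """Illustrative smooth positive density on the circle."""
    return 1 + 0.5 * math.cos(TWO_PI * x)


def drho(x):
    return -0.5 * TWO_PI * math.sin(TWO_PI * x)


def v(x):
    """Illustrative smooth vector field with zero mean."""
    return 0.1 * math.sin(TWO_PI * x)


def dv(x):
    return 0.1 * TWO_PI * math.cos(TWO_PI * x)


def max_error(t, samples=400):
    """Largest gap between the exact density and its first-order expansion."""
    worst = 0.0
    for i in range(samples):
        x = i / samples
        y = x + t * v(x)                      # image of x under Id + t v
        exact = rho(x) / (1 + t * dv(x))      # density of the pushed measure at y
        first_order = rho(y) - t * (drho(y) * v(y) + rho(y) * dv(y))
        worst = max(worst, abs(exact - first_order))
    return worst


error_small = max_error(1e-3)
error_large = max_error(1e-2)
```

Since both functions are smooth here, the gap shrinks roughly like the square of $t$, which is consistent with the remark that the remainder is an $O(t^2)$ for $C^2$ data.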

4. First-order dynamics in the model case

In this section we show that $\Phi_{d\#}$ is (weakly) differentiable at the point $\lambda$. Its derivative is an explicit, simple endomorphism of a Hilbert space, and we shall give a brief study of its spectrum.

Theorem 4.1. — Let $\mathscr{L}_d : L^2_0 \to L^2_0$ be the linear operator defined by

$$\mathscr{L}_d v(x) = v(x/d) + v((x+1)/d) + \dots + v((x + d - 1)/d).$$

Then $\mathscr{L}_d$ is the derivative of $\Phi_{d\#}$ at $\lambda$ in the following sense: for all $v \in L^2_0 \simeq T_\lambda$, one has

$$W(\Phi_{d\#}(\lambda + tv), \lambda + t\mathscr{L}_d(v)) = o(t).$$

First, we recognize in $\mathscr{L}_d$ a multiple of the Perron-Frobenius operator of $\Phi_d$, that is the adjoint of the map $u \mapsto u\circ\Phi_d$, acting on the space $L^2_0$. Second, we only get a Gâteaux derivative, when one would prefer a Fréchet one, that is a formula of the kind

$$W(\Phi_{d\#}(\lambda + v), \lambda + \mathscr{L}_d(v)) = o(|v|).$$

However, we shall see that such a uniform bound does not hold. Nevertheless, one easily gets uniform remainder terms in restriction to any finite-dimensional subspace of $L^2_0$.

4.1. Differentiability of $\Phi_{d\#}$. — The main point to prove in the above theorem is the following estimate.

Lemma 4.2. — Given a density $\rho$, vector fields $v_1, \dots, v_n \in L^2(\rho\lambda)$ and positive numbers $a_1, \dots, a_n$ adding up to 1, one has

$$W\Big(\rho\lambda + t\sum_i a_i v_i,\ \sum_i a_i(\rho\lambda + tv_i)\Big) = o(t).$$

We could deduce this result from Proposition 3.2 but for the sake of diversity let us give a different proof, which is almost contained in Figure 2.

Proof. — We prove the case $n = 2$ since the general case can then be deduced by a straightforward induction. Let $\varepsilon$ be any positive number. Let $\bar\rho$, $\bar v_1$ and $\bar v_2$ be a piecewise constant density and two piecewise constant vector fields that approximate $\rho$ in $L^1$ norm and $v_1$ and $v_2$ in $L^2(\rho\lambda)$ norm: $|\rho - \bar\rho|_1 \le \varepsilon^2$ and $|v_i - \bar v_i|_{L^2(\rho\lambda)} \le \varepsilon$.

The measure $((\mathrm{Id} + v_i)\times(\mathrm{Id} + \bar v_i))_\#\rho\lambda$ is a transport plan from $\rho\lambda + v_i$ to $\rho\lambda + \bar v_i$, whose cost is $|v_i - \bar v_i|^2_{L^2(\rho\lambda)}$. This shows that $W(\rho\lambda + v_i, \rho\lambda + \bar v_i) \le \varepsilon$. A transport plan $\Pi$ from $\rho\lambda$ to $\bar\rho\lambda$ that lets the common mass in place and transports the rest in any way moves a mass $\frac{1}{2}|\rho - \bar\rho|_1$ by a distance at most $\frac{1}{2}$, thus $W(\rho\lambda, \bar\rho\lambda) \le 2^{-3/2}\varepsilon$. Now $(\mathrm{Id} + \bar v_i, \mathrm{Id} + \bar v_i)_\#\Pi$ is a transport plan from $\rho\lambda + \bar v_i$ to $\bar\rho\lambda + \bar v_i$ with the same cost as $\Pi$, so that $W(\rho\lambda + \bar v_i, \bar\rho\lambda + \bar v_i) \le 2^{-3/2}\varepsilon$. It follows that

$$W(\rho\lambda + tv_i, \bar\rho\lambda + t\bar v_i) \le C\varepsilon t$$

for a constant $C = 2^{-3/2} + 1$, and similarly

$$W\Big(\rho\lambda + \sum a_i t v_i, \bar\rho\lambda + \sum a_i t\bar v_i\Big) \le C\varepsilon t.$$



We can moreover assume that $\bar\rho$ and $\bar v_i$ are constant on each interval of the form $[i/k, (i+1)/k)$ for some fixed $k$ (depending upon $\rho$, $v_1$, $v_2$ and $\varepsilon$).

To see what happens on such an interval $I$, temporarily denoting by $\bar\rho$, $\bar v_1$ and $\bar v_2$ the values taken by the functions $\bar\rho$ and $\bar v_i$ on $I$, let us construct for $t$ small enough an economic transport plan from $(\mathrm{Id} + t(a_1\bar v_1 + a_2\bar v_2))_\#\bar\rho\lambda|_I$ to $a_1(\mathrm{Id} + t\bar v_1)_\#\bar\rho\lambda|_I + a_2(\mathrm{Id} + t\bar v_2)_\#\bar\rho\lambda|_I$. If the intervals $(\mathrm{Id} + t\bar v_1)(I)$ and $(\mathrm{Id} + t\bar v_2)(I)$ meet, one can simply let the common mass in place and move at each side a mass $a_1a_2\bar\rho|\bar v_1 - \bar v_2|t$ by a distance at most $|\bar v_1 - \bar v_2|t$ (see figure 2; this is not optimal but sufficient for our purpose). This transport plan has a cost $2a_1a_2\bar\rho|\bar v_1 - \bar v_2|^3t^3 \le t^3\bar\rho|\bar v_1 - \bar v_2|^3$.

Figure 2. The cost of this transport plan has the order of magnitude $t^3$.

If the intervals $(\mathrm{Id} + t\bar v_1)(I)$ and $(\mathrm{Id} + t\bar v_2)(I)$ do not meet, then $t|\bar v_1 - \bar v_2| \ge 1/k$ and simple translations give a transport plan with cost at most

$$a_1\frac{\bar\rho}{k}t^2|\bar v_1 - \bar v_2|^2 + a_2\frac{\bar\rho}{k}t^2|\bar v_1 - \bar v_2|^2 \le \bar\rho t^3|\bar v_1 - \bar v_2|^3.$$

By adding one such plan for each interval $[i/k, (i+1)/k)$, we get a transport plan from $(\mathrm{Id} + t(a_1\bar v_1 + a_2\bar v_2))_\#\bar\rho\lambda$ to $a_1(\mathrm{Id} + t\bar v_1)_\#\bar\rho\lambda + a_2(\mathrm{Id} + t\bar v_2)_\#\bar\rho\lambda$ whose cost is at most $k|\bar v_1 - \bar v_2|^3_{L^3(\bar\rho\lambda)}t^3$. Note that even if the $v_i$ are only $L^2$, the $\bar v_i$ are bounded and therefore in $L^3(\bar\rho\lambda)$. Now we have

$$W\big(\bar\rho\lambda + t(a_1\bar v_1 + a_2\bar v_2),\ a_1(\bar\rho\lambda + t\bar v_1) + a_2(\bar\rho\lambda + t\bar v_2)\big) \le k^{1/2}|\bar v_1 - \bar v_2|^{3/2}_{L^3(\bar\rho\lambda)}t^{3/2}$$

so that, for $t$ small enough,

$$W\big(\bar\rho\lambda + t(a_1\bar v_1 + a_2\bar v_2),\ a_1(\bar\rho\lambda + t\bar v_1) + a_2(\bar\rho\lambda + t\bar v_2)\big) \le \varepsilon t.$$

By the triangle inequality, it follows that

$$W\big(\rho\lambda + t(a_1v_1 + a_2v_2),\ a_1(\rho\lambda + tv_1) + a_2(\rho\lambda + tv_2)\big) \le C'\varepsilon t$$

for a constant $C' = 2^{-1/2} + 3$. □

Proof of Theorem 4.1. — Remark that

$$\Phi_{d\#}(\lambda + tv) = \frac{1}{d}\big(\lambda + dtv(\cdot/d)\big) + \frac{1}{d}\big(\lambda + dtv((\cdot + 1)/d)\big) + \dots + \frac{1}{d}\big(\lambda + dtv((\cdot + d - 1)/d)\big)$$

and apply the preceding lemma. □

Let us prove that we cannot hope for the Fréchet differentiability of $\Phi_{d\#}$. We only treat the case $d = 2$ for simplicity.

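Proposition 4.3. — For all positive $\varepsilon$, there is a vector field $v \in L^2_0$ that satisfies the following:

1. $|v|_2 \le \varepsilon$,

2. $\mathscr{L}_2 v = 0$ so that $\lambda + \mathscr{L}_2 v = \lambda$, and

3. $W(\Phi_{2\#}(\lambda + v), \lambda) \ge c\varepsilon$

for some constant $c$ independent of $\varepsilon$ and $v$.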

Proof. — Let $k$ be a positive integer, to be specified later on. Let $v$ be the piecewise affine map defined as follows (see figure 3): $v(x) = 1/(4k) - y$ when $x = i/(2k) + y$ with $y \in [0, 1/(2k))$ and $0 \le i < k$ an integer, and $v(x) = -1/(4k) + y$ when $x = i/(2k) + y$ with $y \in [0, 1/(2k))$ and $k \le i < 2k$. We have $|v|_2^2 = (4k)^{-2}/3$ so that taking $k \ge \varepsilon^{-1}$ ensures point 1. Moreover, 2 is straightforward, and we have left to prove that $k$ chosen with the order of $\varepsilon^{-1}$ gives 3.

On any small enough interval $I$, if $w$ is an affine function of slope $-1$ with a zero at the center of $I$, then $\lambda|_I + w$ is a Dirac mass at the center of $I$ (each element of mass is moved to the center). If $w$ has slope 1, then the mass moves in the other direction, and $\lambda|_I + w$ is uniform of density $1/2$ on the interval $I'$ having the same center as $I$ and twice as long. By combining these two observations, one deduces that

$$\mu := \Phi_{2\#}(\lambda + v) = \frac{1}{2}\lambda + \sum_{i=1}^{k}\frac{1}{2k}\delta_{(i - 1/2)/k}.$$

Figure 3. The case $k = 4$. Up: the graph of $v$; middle: $\lambda + v$; down: $\Phi_{2\#}(\lambda + v)$.

Each interval of the form $I_i = [(i - 5/8)/k, (i - 3/8)/k)$ is given by $\lambda$ a mass $1/(4k)$. The discrete part of $\mu$ consists in a Dirac mass of weight $1/(2k)$ at the center of each $I_i$. Any transport plan from $\mu$ to $\lambda$ must therefore move a mass at least $1/(4k)$ from each of these Dirac masses to the outside of $I_i$, so that a total mass at least $1/4$ has to move a distance at least $1/(8k)$. From this it follows that $W(\lambda, \mu) \ge 1/(16k)$. When $k$ is chosen with the order of $\varepsilon^{-1}$, this distance has at least the order of $\varepsilon$, as required. □
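The first two properties of the vector field used in this proof can be verified mechanically. The sketch below implements the piecewise affine field with exact rational arithmetic, checks that its image under the operator $v \mapsto v(\cdot/2) + v((\cdot + 1)/2)$ vanishes at sample points, and approximates its squared $L^2$ norm by a midpoint rule. The function names and the sample points are illustrative assumptions.

```python
import math
from fractions import Fraction


def v(x, k):
    """Piecewise affine field: slope -1 on [0, 1/2), slope +1 on [1/2, 1),
    vanishing at the center of each interval [i/(2k), (i+1)/(2k))."""
    x = x % 1
    i = math.floor(2 * k * x)
    y = x - Fraction(i, 2 * k)
    if i < k:
        return Fraction(1, 4 * k) - y
    return y - Fraction(1, 4 * k)


def transfer_2(x, k):
    """Value at x of v(x/2) + v((x+1)/2)."""
    return v(x / 2, k) + v((x + 1) / 2, k)


def squared_norm(k, cells_per_piece=50):
    """Midpoint-rule approximation of the integral of v**2 over the circle."""
    n = 2 * k * cells_per_piece
    total = sum(v(Fraction(2 * j + 1, 2 * n), k) ** 2 for j in range(n))
    return total / n


k = 4
images = [transfer_2(Fraction(j, 97), k) for j in range(97)]
approx_norm = float(squared_norm(k))
exact_norm = 1 / (3 * (4 * k) ** 2)
```

The two halves of the circle carry opposite copies of the same sawtooth, which is why the two preimage values always cancel.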

4.2. Spectral study of Jtfd- — Let us compute the spectrum of = 
Dx{^ci#). The following proposition is very elementary and not new, but 
we produce a proof for the sake of completeness. 

Proposition 4-4- — ^ number a is an eigenvalue of if and only if 
I a I < d. Moreover, each eigenvalue has an infinite- dimensional eigen- 
space. Last, the spectrum of ^d is the closed disc of radius 2. 

The proof of Proposition 4.4 consists simply in using Fourier series
to show that (up to a multiplicative constant) $\mathscr{L}_d$ is conjugate to a
countable product of the shift on $\ell^2(\mathbb{N})$.

Proof. — Let $c_k$ denote the function $x \mapsto \cos(2\pi k x)$ defined on the circle,
and $s_k : x \mapsto \sin(2\pi k x)$. Then it is readily checked that $\mathscr{L}_d c_k = \mathscr{L}_d s_k = 0$
when $d$ does not divide $k$, and $\mathscr{L}_d c_k = d\, c_{k/d}$, $\mathscr{L}_d s_k = d\, s_{k/d}$ when $d \mid k$.
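These identities are easy to test numerically. The sketch below assumes that $\mathscr{L}_d$ acts as the sum of $v$ over the $d$ preimages of $x$ under $x \mapsto dx \bmod 1$, which is what the general formula of Section 5 gives for constant density; `transfer`, `cos_mode`, `sin_mode` and `max_gap` are hypothetical names used only here.

```python
import math

def transfer(v, d):
    """Assumed form of the model derivative: sum of v over the d preimages of x."""
    return lambda x: sum(v((x + j) / d) for j in range(d))

def cos_mode(k):
    return lambda x: math.cos(2 * math.pi * k * x)

def sin_mode(k):
    return lambda x: math.sin(2 * math.pi * k * x)

def max_gap(f, g, n=257):
    """Largest difference between f and g on a uniform grid of the circle."""
    return max(abs(f(i / n) - g(i / n)) for i in range(n))

d = 3
# d divides k = 6: the mode of frequency 6 is sent to d times the mode of frequency 2
gap_cos = max_gap(transfer(cos_mode(6), d), lambda x: d * cos_mode(2)(x))
gap_sin = max_gap(transfer(sin_mode(6), d), lambda x: d * sin_mode(2)(x))
# d does not divide k = 5: the mode is sent to zero
gap_zero = max_gap(transfer(cos_mode(5), d), lambda x: 0.0)
```

All three gaps are at the level of rounding error, in line with the two cases distinguished in the proof.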

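Let $\sigma$ be the shift of the Hilbert space $\ell^2 = \ell^2(\mathbb{N})$ of $\mathbb{N}$-indexed square
integrable sequences: if $x = (x_0, x_1, x_2, \dots)$ then $\sigma x = (x_1, x_2, x_3, \dots)$.
Consider the direct product of countably many copies of $\sigma$, acting diagonally on the space
of sequences $x = (x^0, x^1, x^2, \dots)$ such that $x^i \in \ell^2$ and $\sum_i |x^i|^2 < \infty$.
Then the map from this space to $L^2_0$ defined by

$$x \mapsto \sum_{i, j \ge 0} \Big( x_j^{2(d-1)i}\, c_{(di+1)d^j} + x_j^{2(d-1)i+1}\, c_{(di+2)d^j} + \dots + x_j^{2(d-1)i+d-2}\, c_{(di+d-1)d^j}$$
$$\qquad\qquad + x_j^{2(d-1)i+d-1}\, s_{(di+1)d^j} + x_j^{2(d-1)i+d}\, s_{(di+2)d^j} + \dots + x_j^{2(d-1)i+2d-3}\, s_{(di+d-1)d^j} \Big)$$

is an isomorphism (and even an isometry) that intertwines this direct product and $\frac{1}{d}\mathscr{L}_d$.
The spectral study of $\mathscr{L}_d$ therefore reduces to that of $\sigma$.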

A non-zero eigenvector of $\sigma$, associated to an eigenvalue $\alpha$, must have
the form $(x, \alpha x, \alpha^2 x, \dots)$ with $x \ne 0$. Such a sequence is square integrable
if and only if $|\alpha| < 1$. Moreover the operator norm of $\sigma$ is $1$, so that its
complex spectrum is a subset of the closed unit disc. Since the spectrum
is closed, and contains the set of eigenvalues, it is equal to the closed unit
disc. □
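As an illustration of this last step, here is a minimal sketch on truncated sequences (the truncation length 60 and the value of `alpha` are arbitrary choices, and `shift` and `geometric` are hypothetical names). It checks that a geometric sequence is an eigenvector of the shift and that its squared norm approaches $1/(1 - |\alpha|^2)$ when $|\alpha| < 1$.

```python
def shift(seq):
    """Truncated version of the shift: drop the first term."""
    return seq[1:]

def geometric(alpha, n):
    """First n terms of the sequence (1, alpha, alpha^2, ...)."""
    return [alpha ** j for j in range(n)]

alpha = 0.5 + 0.3j                       # any complex number of modulus < 1
x = geometric(alpha, 60)

# eigenvector relation: the shifted sequence equals alpha times the sequence
# (zip stops at the shorter list, which absorbs the truncation)
residual = max(abs(s - alpha * t) for s, t in zip(shift(x), x))

# square summability: partial sums approach the geometric series limit
norm_sq = sum(abs(t) ** 2 for t in x)
limit = 1 / (1 - abs(alpha) ** 2)

# on the unit circle the partial sums grow linearly, so the sequence is not in l^2
unit_norm_sq = sum(abs(t) ** 2 for t in geometric(1j, 60))
```

For a parameter of modulus one, the partial sums equal the number of terms, which is why such values are not eigenvalues although they lie in the spectrum.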



4.3. Discussion of the non-Fréchet differentiability. — The counter-example
to the Fréchet differentiability of $\Phi_{d\#}$ at $\lambda$ has high total variation,
and it is likely that using a norm that controls variations (e.g. a
Sobolev norm) on (a subspace of) $T_\lambda$ would provide a uniform error bound.

Moreover, up to multiplication by $d$ the derivative $\mathscr{L}_d$ is the Perron-Frobenius
operator of $\Phi_d$, and such operators have far more subtle spectral
properties when defined over Sobolev spaces.

For these two reasons, it seems that one could search for a modification
of optimal transport that would give a manifold structure to $\mathscr{P}(\mathbb{S}^1)$, in
such a way that $T_\lambda$ identifies with a Sobolev space. A way to achieve
this could be to penalize not only the distance by which a transport plan
moves mass, but also the distortion, that is the variation of the pairwise
distances of the elements of mass. This should impose more regularity
on optimal transport plans.
5. First-order dynamics for general expanding maps

In this section, we consider a general map $\Phi : \mathbb{S}^1 \to \mathbb{S}^1$ assumed to be
$C^2$ and expanding, i.e. $|\Phi'| > 1$. Such a map is a self-covering, and has a
unique absolutely continuous invariant measure (see e.g. [KH95]) which
has a positive and $C^1$ density [Krz77], denoted by $\rho$. The measure itself
is denoted by $\rho\lambda$. Note that as sets, $L^2(\rho\lambda) = L^2$, although they differ
as Hilbert spaces. All integrals where the variable is implicit are with
respect to the Lebesgue measure $\lambda$.

The result is as follows. 

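Theorem 5.1. — The map $\Phi_\#$ has a Gâteaux derivative $\mathscr{L} : L^2_0(\rho\lambda) \to L^2_0(\rho\lambda)$
at $\rho\lambda$, given by

$$\mathscr{L} v(x) = \sum_{y \in \Phi^{-1}(x)} \frac{\rho(y)}{\rho(x)}\, v(y) - \frac{1}{\rho(x)}\, \frac{\int \frac{\rho}{\rho \circ \Phi}\, \Phi' v}{\int \frac{1}{\rho}}.$$

Moreover the adjoint operator of $\mathscr{L}$ in $L^2_0(\rho\lambda)$ is given by

$$\mathscr{L}^* u = \Phi'\, u \circ \Phi.$$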

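5.1. Proof of Theorem 5.1. — First, as in the case of $\Phi_d$, Lemma 4.2
shows that for $v \in L^2_0(\rho\lambda)$,

$$W\big(\Phi_\#(\rho\lambda + tv),\ \rho\lambda + t\bar{\mathscr{L}} v\big) = o(t) \tag{4}$$

where

$$\bar{\mathscr{L}} v(x) = \sum_{y \in \Phi^{-1}(x)} \frac{\rho(y)}{\rho(x)}\, v(y)$$

is the first term in the expression of $\mathscr{L}$. In words, each of the inverse
images of $x$ gives a contribution to the local displacement of mass that is
proportional to $v(y)$ and to $\rho(y)$.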

This seems very similar to the case of $\Phi_d$, except that the first term $\bar{\mathscr{L}}$ need not
map $L^2_0(\rho\lambda)$ to itself! Let us stress, once again, that the condition that
$v \in L^2_0(\rho\lambda)$ has mean zero is to be understood with respect to the uniform
measure $\lambda$, since it translates the metric property of being (close to) the
gradient of a smooth function. This does not prevent Equation (4) from
making sense, but shows that $\bar{\mathscr{L}} v$ cannot be considered as the directional
derivative of $\Phi_\#$ since it does not belong to $T_{\rho\lambda} = L^2_0(\rho\lambda)$. In fact, we
shall see that there is another vector field, that lies in $L^2_0(\rho\lambda)$ and gives
the same pushed measure (at least at order 1).
Proposition 5.2. — Given $w \in L^2(\rho\lambda)$ and assuming that $w$ is $C^1$,
there is a $C^1$ vector field $\tilde{w} \in L^2_0(\rho\lambda)$ such that $W(\rho\lambda + tw, \rho\lambda + t\tilde{w}) = o(t)$.
Moreover, $\tilde{w}$ is given by

$$\tilde{w} = w - \frac{\int w}{\int \frac{1}{\rho}}\, \frac{1}{\rho}.$$


Proof. — This is a direct application of Proposition 3.2: we search for a
$\tilde{w}$ such that $(\rho\tilde{w})' = (\rho w)'$, so that the densities $\rho_t$ and $\tilde{\rho}_t$ of $\rho\lambda + tw$ and
$\rho\lambda + t\tilde{w}$ are $L^\infty$ and therefore close to one another. This ensures
that $W(\rho\lambda + tw, \rho\lambda + t\tilde{w}) \lesssim |\rho_t - \tilde{\rho}_t| = o(t)$.

But there exists exactly one vector field $\tilde{w}$ that is $C^1$, has mean zero,
and such that $(\rho\tilde{w})' = (\rho w)'$: it is given by the claimed formula. □
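The two defining properties of the corrected field can be checked on an example. The sketch below is only an illustration: the density `rho` and the field `w` are hypothetical choices (any positive density and any smooth field would do), integrals over the circle are approximated by a rectangle rule, and the correction is $w$ minus a constant multiple of $1/\rho$, the constant being fixed by the zero-mean condition.

```python
import math

N = 2000
GRID = [i / N for i in range(N)]

def mean(f):
    """Rectangle-rule approximation of the integral of f over the circle."""
    return sum(f(x) for x in GRID) / N

def rho(x):
    """Hypothetical positive density with total mass 1."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * x)

def w(x):
    """Hypothetical smooth vector field with non-zero Lebesgue mean."""
    return 0.3 + math.sin(2 * math.pi * x)

# constant c such that w - c / rho has zero mean with respect to Lebesgue measure
c = mean(w) / mean(lambda x: 1.0 / rho(x))

def w_tilde(x):
    return w(x) - c / rho(x)

mean_w_tilde = mean(w_tilde)

# rho * (w_tilde - w) should be constant, so that (rho w_tilde)' = (rho w)'
values = [rho(x) * (w_tilde(x) - w(x)) for x in GRID]
spread = max(values) - min(values)
```

Both `mean_w_tilde` and `spread` vanish up to rounding, which reflects the fact that adding a multiple of $1/\rho$ to $w$ changes $\rho w$ only by a constant.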

Note that we did not bother to prove the uniqueness of $\tilde{w}$: Gigli's construction
shows that the first order perturbation of the measure (with
respect to the Wasserstein metric) characterizes a tangent vector in
$T_\mu$, see Theorem 5.5 in [Gig09a].



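Now if one considers the "centering" operator $\mathscr{C} : L^2(\rho\lambda) \to L^2_0(\rho\lambda)$
defined by

$$\mathscr{C} w = w - \frac{\int w}{\int \frac{1}{\rho}}\, \frac{1}{\rho},$$

the derivative of $\Phi_\#$ at $\rho\lambda$ is given by the composition of $\mathscr{C}$ with the first term
$\bar{\mathscr{L}}$ of $\mathscr{L}$. Indeed, the previous proposition shows this for a $C^1$ argument, but $C^1$ vector
fields are dense in $L^2_0(\rho\lambda)$ and the involved operators are continuous in
the $L^2(\rho\lambda)$ topology.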

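To get the expression of $\mathscr{L}$ given in Theorem 5.1 one only needs a
change of variable: the $d$ right inverses to $\Phi$ are onto intervals
$[a_1 = 0, a_2), [a_2, a_3), \dots, [a_d, a_{d+1} = 1)$, and one has

$$\int \sum_{y \in \Phi^{-1}(x)} \frac{\rho(y)}{\rho(x)}\, v(y)\, dx = \sum_{i=1}^{d} \int_{a_i}^{a_{i+1}} \frac{\rho}{\rho \circ \Phi}\, \Phi' v = \int \frac{\rho}{\rho \circ \Phi}\, \Phi' v.$$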

The computation of the adjoint is a similar change of variable that we
omit. Note that the adjoint of the extension to $L^2(\rho\lambda)$ of $\mathscr{L}$ (with the
same expression) is

$$u \mapsto \Phi'\, u \circ \Phi - \frac{\Phi'}{\rho \circ \Phi}\, \frac{\int u}{\int \frac{1}{\rho}}$$

and the second term vanishes when $u$ is in $L^2_0(\rho\lambda)$. The first term is also
the adjoint in $L^2(\rho\lambda)$ of the first term $\bar{\mathscr{L}}$ of $\mathscr{L}$, and this adjoint preserves $L^2_0(\rho\lambda)$. In other
words, $\mathscr{L}$ is the adjoint in $L^2_0(\rho\lambda)$ of the adjoint in $L^2(\rho\lambda)$ of $\bar{\mathscr{L}}$. An
interesting feature of the expression of $\mathscr{L}^*$ is that it does not involve the
invariant measure.
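The change of variable behind the adjoint formula can be illustrated numerically. In the sketch below, the doubling map stands in for $\Phi$, and `rho` is an arbitrary positive weight rather than an invariant density: the identity being tested, namely that the pairing of the weighted sum over preimages with $u$ equals the pairing of $v$ with $\Phi'\, u \circ \Phi$ in $L^2(\rho\lambda)$, relies only on the change of variable $x = \Phi(y)$. All function names are hypothetical.

```python
import math

N = 1000
TWO_PI = 2 * math.pi

def rho(x):
    """Arbitrary positive weight on the circle (not an invariant density)."""
    return 1.0 + 0.4 * math.sin(TWO_PI * x)

def v(x):
    return math.cos(TWO_PI * x) + 0.2 * math.sin(2 * TWO_PI * x)

def u(x):
    return math.sin(TWO_PI * x) + 0.5 * math.cos(3 * TWO_PI * x)

def first_term(x):
    """Sum over the two preimages y of x under doubling of rho(y) v(y) / rho(x)."""
    preimages = (x / 2, (x + 1) / 2)
    return sum(rho(y) * v(y) for y in preimages) / rho(x)

# left-hand side: integral of first_term(x) * u(x) * rho(x) over the circle
lhs = sum(first_term(i / N) * u(i / N) * rho(i / N) for i in range(N)) / N

# right-hand side: integral of v(y) * Phi'(y) * u(Phi(y)) * rho(y), with Phi' = 2
rhs = sum(
    v(y) * 2 * u((2 * y) % 1.0) * rho(y)
    for y in (m / (2 * N) for m in range(2 * N))
) / (2 * N)

gap = abs(lhs - rhs)
```

The grid for the right-hand side is chosen as the preimage of the grid for the left-hand side, so the two discrete sums agree up to rounding.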

5.2. Spectral study. — Even if $\mathscr{L}$ is not a multiple of the Perron-Frobenius
operator of $\Phi$, its first term $\bar{\mathscr{L}}$ is a weighted transfer operator,
with weight $g = \rho/(\rho \circ \Phi)$. According to Theorem 2.5 in [Bal00], every number
of modulus less than $R_g = \lim_n (\sup \bar{\mathscr{L}}^n 1)^{1/n}$ is an eigenvalue of infinite
multiplicity with continuous eigenfunctions.

Proposition 5.3. — We have $R_g \ge \min \Phi' > 1$, and in consequence
there is an infinite linearly independent family $(v_i)_i$ of continuous functions
in $L^2_0(\rho\lambda)$ such that $\mathscr{L} v_i = v_i$.

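Proof. — Let $m = \min \Phi'$: we have $m > 1$ and, since $\rho\lambda$ is invariant,
$\rho(x) = \sum_{y \in \Phi^{-1}(x)} \rho(y)/\Phi'(y)$.
It follows that for all positive continuous function $f$,

$$\bar{\mathscr{L}} f(x) = \sum_{y \in \Phi^{-1}(x)} \frac{\rho(y)}{\rho(x)}\, f(y) \ge m \sum_{y \in \Phi^{-1}(x)} \frac{\rho(y)}{\Phi'(y)\, \rho(x)}\, f(y) \ge m \inf f;$$

in particular, $R_g \ge m > 1$ and there is a linearly independent infinite
family $u_0, u_1, \dots, u_i, \dots$ of continuous 1-eigenfunctions of the first term $\bar{\mathscr{L}}$ of $\mathscr{L}$. If not all
have mean zero (with respect to Lebesgue's measure $\lambda$), assume the mean
of $u_0$ is not zero and let $v_i = u_i - a_i u_0$ where $a_i$ is chosen such that
$\int v_i\, \lambda = 0$. Otherwise, simply put $v_i = u_i$.

Now, since $\bar{\mathscr{L}} v_i = v_i$ and $v_i$ has mean zero, the centering operator $\mathscr{C}$ leaves
$\bar{\mathscr{L}} v_i$ unchanged and we get $\mathscr{L} v_i = \mathscr{C}\bar{\mathscr{L}} v_i = v_i$. □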



In the same way, we see that all numbers of modulus less than $m$ (where $m > 1$) are
eigenvalues of $\mathscr{L}$ (with infinite multiplicity and continuous eigenfunctions).
6. Nearly invariant measures

In this section we prove Theorem 1.4 and Proposition 1.5.

6.1. Construction. — Fix some positive integer $n$ and let $v_1, \dots, v_n$
be continuous, linearly independent eigenfunctions for $\mathscr{L} = D_{\rho\lambda}(\Phi_\#)$,
associated with the eigenvalue $1$.

For all $a = (a_1, \dots, a_n) \in B^n(0, \eta)$, define $E(a) = \rho\lambda + \sum_i a_i v_i \in \mathscr{P}(\mathbb{S}^1)$
and using Proposition 3.1 choose $\eta$ small enough to ensure that $E$ is bi-Lipschitz.
Then define $F(a) = E(\eta a)$ on the unit ball $B^n$.

Proposition 6.1. — We have

$$W(\Phi_\#(F(a)), F(a)) = o(|a|)$$

and, as a consequence, for all $\varepsilon > 0$ and all integer $K$, there is a radius
$r$ such that for all $k \le K$ and all $a \in B^n(0, r)$ the following holds:

$$W(\Phi^k_\#(F(a)), F(a)) \le \varepsilon |a|.$$

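Proof. — Since we have restricted ourselves to a finite-dimensional space,
we have $W\big(\Phi_\#(\rho\lambda + \eta \sum a_i v_i),\ \rho\lambda + \eta \sum a_i \mathscr{L}(v_i)\big) = o(|a|)$ and, since
$\mathscr{L}(v_i) = v_i$, we get $W(\Phi_\#(F(a)), F(a)) = o(|a|)$.

The second inequality follows easily. The map $\Phi_\#$ is $L$-Lipschitz for
some $L > 1$ ($L = d$ in the model case, $L > d$ otherwise). For all $\varepsilon > 0$
and for all integer $K$, let $r > 0$ be small enough to ensure that

$$|a| \le r \ \Rightarrow\ W(\Phi_\#(F(a)), F(a)) \le \frac{L - 1}{L^K - 1}\, \varepsilon |a|.$$

Then, for all $k \le K$,

$$W(\Phi^k_\#(F(a)), F(a)) \le \sum_{\ell=0}^{k-1} W\big(\Phi^{\ell+1}_\#(F(a)), \Phi^{\ell}_\#(F(a))\big) \le \sum_{\ell=0}^{k-1} L^{\ell}\, W(\Phi_\#(F(a)), F(a)) \le \varepsilon |a|,$$

where the last inequality uses $\sum_{\ell=0}^{k-1} L^\ell = (L^k - 1)/(L - 1) \le (L^K - 1)/(L - 1)$.

□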

This ends the proof of Theorem 1.4. It would be interesting to have
explicit control on $r$ in terms of $\varepsilon$, $n$ and $K$, and in particular to replace
the $o(|a|)$ by an $O(|a|^\alpha)$ for some $\alpha > 1$. This seems difficult because, even
in the model case where the $v_i$ are explicit, we can approximate them by $C^\infty$
vector fields Wi with a good control on {—w[) ^ and w' , but only bad
bounds on w" (and therefore the modulus of continuity of w'). 

6.2. Regularity. — Let us prove that given $\mu$ an atomless measure
and $v \in L^2_0(\mu)$ (or, indifferently, $v \in L^2(\mu)$), for all but countably many
values of the parameter $t$, the measure $\mu + tv$ has no atom.

Proof of Proposition 1.5. — By a line in $T\mathbb{S}^1 \simeq \mathbb{S}^1 \times \mathbb{R}$, we mean the
image of a non-horizontal line of $\mathbb{R}^2$ by the quotient map $(x, y) \mapsto (x \bmod 1, y)$.
We sometimes refer to a line by an equation of one of its lifts
in $\mathbb{R}^2$.

The measure $\mu + tv$ has an atom at $s$ if and only if the measure $\Gamma =
(\mathrm{Id}, v)_\# \mu$ defined on $T\mathbb{S}^1$ gives a positive mass to the line $(x + ty = s)$.
Since $\mu$ has no atom, neither does $\Gamma$, and since two lines intersect in a
countable set, the intersection of two lines is $\Gamma$-negligible. It follows that
there can be at most $n$ different lines that are given a mass at least $1/n$
by $\Gamma$. In particular, at most countably many lines are given a positive
mass by $\Gamma$, and the result follows. □

For a general vector field, we cannot hope for more. The following
folklore example shows an $L^2_0$ function $v$ such that $\lambda + tv$ is singular with
respect to $\lambda$ for almost all $t$.

Example 6.2. — Let $K$ be a four-corner Cantor set of $\mathbb{R}^2$. More precisely,
$A, B, C, D$ are the vertices of a square, $S_A, S_B, S_C, S_D$ are the
homotheties of coefficient $1/4$ centered at these points, and $K$ is the unique
fixed point of the map defined on compact sets $M \subset \mathbb{R}^2$ by

$$M \mapsto S_A(M) \cup S_B(M) \cup S_C(M) \cup S_D(M).$$

The Cantor set $K$ projects on a well-chosen line to an interval, see Figure 4,
while in almost all directions it projects to $\lambda$-negligible sets, see e.g.
[PSS03] for a proof. Choose the square so that $K$ projects vertically to
$[0, 1]$ (identified with $\mathbb{S}^1$) and for $x \in [0, 1]$ define $v(x)$ as the least $y$ such
that $(x, y) \in K$. Then $v$ is bounded and, up to a vertical translation, we can
even assume that $v \in L^2_0$. But for almost all $t$, the measure $\lambda + tv$ is
concentrated on a negligible set.
Figure 4. A square Cantor set that projects vertically to a
segment, but projects in almost all directions to negligible sets.
On the right, an approximation of the graph of the function $v$.